Sizing

Parameters

When sizing an Orbeon Forms application, or considering whether you need to improve the performance of your application, you need to take into account the following parameters:
Number of cores needed

Roughly, the number of CPU cores you will need will be:

    C = (PT / PF + AT / AF) * AU

This is of course an approximation. It assumes that if a CPU core takes 100 ms to respond for a single user, it will be able to handle 10 requests per second under load. The actual number of requests per second per core can be slightly higher (e.g. thanks to hyper-threading on modern processors) or slightly lower if there is contention. If contention is excessive, i.e. if the number of requests handled per second on a given server under load is significantly lower than 1 divided by the time the server takes to handle one request for a single user, take this as a sign that you need to improve the system configuration to avoid contention.

Examples
Most likely, your application sits somewhere between the first case (very large forms: lots of time spent on the form, few pages loaded) and the second case (simple forms: very little time spent on the form, lots of page loads). The significant difference is in what contributes to the load. In the first case, 92% of the load comes from Ajax requests, while in the second case only 44% of the load comes from Ajax requests. With this knowledge, if an optimization is needed, you can determine whether page loads or Ajax requests need more of your attention.

Tuning the Java Virtual Machine (JVM)

Set -Xms and -Xmx to the same value

The heap is a section of memory used by the JVM to store Java objects. You can set constraints on the size of the heap with two parameters passed to the JVM at startup: -Xms sets the initial size of the heap and -Xmx sets the maximum size of the heap. If you set those two parameters to different values, say 512 MB and 1 GB, you tell the JVM to start with a heap size of 512 MB and increase the size up to 1 GB "as necessary". In this case, the JVM has to balance two conflicting constraints: not requesting too much memory from the operating system (growing to 1 GB too quickly), and not requesting too little, as this would increase the amount of time the JVM spends on garbage collection, which would reduce the performance of your application.

Asking the JVM to balance memory usage and performance by setting different values for -Xms and -Xmx is very reasonable for desktop applications, where a number of applications are running and competing for resources, in particular memory. However, in a server environment you often have one or two major applications running on the server, like the JVM for your application server and maybe a database server. In this situation you have more control over how much memory can be used by each application, and we recommend you set both -Xms and -Xmx to the same value.
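To make the recommendation concrete, here is a minimal Python sketch that builds a matching pair of flags from a single heap size, so the two values cannot drift apart. The helper name heap_flags and the 1024 MB figure are illustrative assumptions, not part of Orbeon Forms or of any application server.

```python
def heap_flags(heap_mb):
    """Return matching -Xms/-Xmx flags for a fixed-size heap of heap_mb megabytes."""
    if heap_mb <= 0:
        raise ValueError("heap size must be a positive number of megabytes")
    # Same value for the initial size (-Xms) and the maximum size (-Xmx).
    return ["-Xms%dm" % heap_mb, "-Xmx%dm" % heap_mb]


# Hypothetical 1 GB heap; paste the result into your server's JVM options.
print(" ".join(heap_flags(1024)))
```

How the resulting string is passed to the JVM depends on your application server's startup scripts.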
Allocate a large heap but don't cause swapping

The larger the heap, the faster your application will be. This is for two reasons: first, the JVM garbage collector works more efficiently with a larger heap, and second, a larger heap enables you to increase the size of the Orbeon Forms cache (more on this later), which will also improve the performance of your application. However, don't use a heap size so large that it would cause swapping, as this would drastically reduce the performance of your application.

We recommend that you first set the heap size based on how much memory the server has and what other major applications are running. Say you have 2 GB of physical memory and no other major application: then you could set the heap to 1.5 GB, which leaves 512 MB to the operating system and minor applications. Say you have 4 GB of physical memory and also a database running on the same server: then you can set the heap size to 2 GB, assign 1.5 GB to the database server, and leave 512 MB to the operating system and minor applications. Then, with a "reasonable" setting in place, monitor the server under normal load and check whether the machine is swapping or whether, on the contrary, the operating system is reporting a lot of available memory. In the first case, reduce the heap size. In the second, increase it.

Permgen errors

If you encounter JVM permgen errors:
Tuning the stack

[ADDED: 2011-11-14] The maximum stack size per thread is not dynamic in the Sun/Oracle JVMs as of 2011-11. This means that code with deep call stacks, for example code using complex XPL and/or XSLT, can cause errors. You might get plain java.lang.StackOverflowError errors, or indirect errors such as the following XSLT exception:
The latter may indicate infinite recursion, but it can also indicate that the JVM stack is simply too small. This is what Oracle says about the default size of the stack:
In such cases, you can try increasing the stack size from the current or default value, for example with the following JVM parameter:
The value to specify should be higher than the one your JVM is running with when the error occurs. The drawback of increasing the stack size is that threads require more memory. That might not be a big issue unless you are running hundreds or thousands of threads, which is not typical.

Tuning the application server

WebLogic: Disable automatic redeployment

WebLogic can be configured to check on a regular basis if the files of your application have changed on disk, and redeploy the application if they did. Redeploying at the application server level is useful when you change the JAR files or some of the underlying configuration files, like the web.xml. As checking if files have changed is incredibly time consuming with WebLogic, and as you are pretty unlikely to change any of those files on a regular basis, we recommend you disable the automatic web application redeployment feature, which is enabled by default.

To do this, after you have installed your Orbeon Forms application, stop WebLogic, and open the config.xml file in an editor. Look for the <WebAppComponent Name="orbeon"> element and add the attribute: ServletReloadCheckSecs="-1".

Disable DNS lookup

You can configure your application server to perform a DNS lookup for every HTTP request. The server always knows the IP address of the machine where the HTTP request originated. However, to get the name, the application server needs to send a DNS lookup query to the DNS server. In most cases, performing this query only has a negligible impact on performance. However, the request can take a significant amount of time in certain cases where the network from which the request originated is badly configured.
In most cases, the application server is doing DNS lookups for "aesthetic reasons": that is, to be able to include in the logs the names of the clients' machines instead of their IP addresses (note that web analysis tools can usually do this reverse DNS lookup much more efficiently when analyzing log files subsequently, typically on a daily basis). So we recommend you change the configuration of your application server to disable DNS lookup, which is in general enabled by default.

On Tomcat 5.5 (external documentation), look for the enableLookups attribute on the <Connector> element and set it to false. If the attribute is not present, add it and set it to false (the default value is true).

Enable gzip compression for generated text content

HTML and XML content usually compresses extremely well using gzip. You can obtain sizes for the content sent by the server to the web browser that are up to 10 times smaller. A very complex XForms page taking 100 KB, for example, may end up taking only about 10 KB. This means that the data will reach the web browser up to 10 times faster than without gzip compression.

Most web and application servers support gzip compression. For example, Tomcat 5.5 supports the attribute compression on the <Connector> element. For more information, please see the Tomcat HTTP connector documentation.

Reduce the number of concurrent processing threads

Servlet containers like Tomcat, or application servers like WebLogic, by default allow a very large number of concurrent threads to enter a servlet. For Tomcat, the default is 200. This means that memory usage cannot be effectively bounded. It is enough to have a few requests slightly overlapping to cause extra memory consumption that can lead to Out of Memory errors. In addition, extra memory usage leads to poorer performance.
As a rule of thumb, you can set the maximum number of concurrent threads to twice the number of cores available on the current machine (assuming this is the only Java VM and application you run on that machine). For instance, on a machine with 2 dual-core CPUs (4 cores), you would typically set the maximum number of concurrent threads to 8. If your application spends a significant amount of time waiting for external resources (like a database or REST/web services), then you might need to use a higher value.

Tomcat configuration

Tomcat appears to accept no value lower than 10 for the maxThreads configuration parameter. Hence Tomcat does not effectively allow you to reduce the number of concurrent processing threads to less than 10:
Apache configuration

When doing load balancing with Apache's mod_proxy, you should use the max parameter to set the hard maximum number of connections allowed to any given Tomcat. For instance, a configuration that limits the number of concurrent connections to each one of your back-end Tomcat servers to 4 would look like:

    ProxyPass / balancer://mycluster/ stickysession=JSESSIONID|jsessionid
    <Proxy balancer://mycluster>
        BalancerMember ajp://192.168.0.100:8009 max=4
        BalancerMember ajp://192.168.0.101:8009 max=4
        BalancerMember ajp://192.168.0.102:8009 max=4
        BalancerMember ajp://192.168.0.103:8009 max=4
    </Proxy>

Unfortunately, in Apache, the value of max is per process. So, you can only effectively limit the number of connections to your back-end servers if Apache has one process and uses threads instead of processes to handle multiple connections, which depends on what MPM is being used by Apache:
WebLogic configuration

With WebLogic, edit the WEB-INF/weblogic.xml file, and add the following elements:

    <wl-dispatch-policy>OrbeonWorkManager</wl-dispatch-policy>
    <work-manager>
        <name>OrbeonWorkManager</name>
        <max-threads-constraint>
            <name>MaxThreadsConstraint</name>
            <count>4</count>
        </max-threads-constraint>
    </work-manager>

The <work-manager> element defines a new work manager with a constraint of a maximum of 4 concurrent threads. Then the <wl-dispatch-policy> element instructs WebLogic to use the work manager you defined for the current web application.

Testing your configuration

You want to test your configuration to make sure it is effective. For this, you issue concurrent requests to /xforms-sandbox/service/image-with-delay, which serves an image after a delay of 5 seconds. Use a tool like JMeter or the Apache HTTP Server Benchmarking Tool (ab) to issue 10 requests at the same time. With ab, for instance, run:

    ab -n 10 -c 10 http://localhost/orbeon/xforms-sandbox/service/image-with-delay

Assuming you set the maximum number of concurrent processing threads to 2, you will then see in the logs that the requests are handled 2 by 2:

    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5042 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5042 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5035 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5034 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5031 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5025 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5035 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5041 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Received request
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5031 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%
    INFO ProcessorService - /xforms-sandbox/service/image-with-delay - Timing: 5044 - Cache hits for cache.main: 289, fault: 1, adds: 0, expirations: 0, success rate: 99%

If you are using JMeter, it will also show you that while all the requests have been sent at the same time, the results come back 2 by 2 at 5-second intervals:

Tuning Orbeon Forms

Where does the time go?

Lots of different things can take time when loading an Orbeon Forms page, including:
When there is a performance issue, it's a good idea to start by trying to figure out where the time is spent. Here are some tools to help, including:
Reduce the number of Ajax requests

By default, events produced by users' interactions with a form are sent by the browser to the server right after the interaction happens. Events are sent by the browser to the server through Ajax requests. A combination of large forms and high traffic can result in a high number of Ajax requests hitting your server, which in turn can impact your site performance.

Filtering out specific events

[Deprecated: This feature is deprecated as of Orbeon Forms 4.5, as it is incompatible with forms created with Form Builder or executed by Form Runner.]

Filtering out the xforms-focus event will:
oxf.xforms.client.events.filter property. Its value is a space-separated list of event names to be filtered. By default that list is empty. To filter the xforms-focus event, set it to:

    <property as="xs:string" name="oxf.xforms.client.events.filter" value="xforms-focus"/>

<xf:model> element, i.e. <xf:model ... xxforms:client.events.filter="xforms-focus">.

Indicate for what controls events should be sent to the server

A more aggressive step consists in setting the client events mode to deferred. By default, every time users change a value and tab to another field, an Ajax event is sent to the server. Those events are useful if something else in the form can change based on the new value; for instance, you might enable a set of fields when users click on a checkbox. But for some forms, most of the values entered by users have no impact on the rest of the form. If this is the case for your form, and you notice that the load from Ajax requests impacts performance, then:
Tune the Orbeon Forms cache size

One way to increase the performance of your application is to increase the size of the Orbeon Forms cache. You set the size of the Orbeon Forms cache with the oxf.cache.size property. Due to limitations of the JVM, you cannot set the size of the Orbeon Forms cache in MB. Instead, the value you specify is the maximum number of objects that Orbeon Forms can store in cache. As the size of each object stored in cache is different, and the average size of those objects can vary widely depending on your application, we can't give you an equivalence between number of objects and memory used. Instead, we recommend you follow the suggestions below to tune your Orbeon Forms cache size.
Disable Saxon source location

Make sure you are not overriding the following property in your properties-local.xml (the default is none, which is what you want for optimal efficiency):

    <property as="xs:string" processor-name="oxf:builtin-saxon" name="location-mode" value="none"/>
    <property as="xs:string" processor-name="oxf:unsafe-builtin-saxon" name="location-mode" value="none"/>
    <!-- This property was used prior to January 2010 builds -->
    <property as="xs:string" processor-name="oxf:saxon8" name="location-mode" value="none"/>

You may set this property to smart during development in order to obtain better line number information. But keep in mind that this has a performance impact. If you have changed this property for development, make sure to set it back to none when testing and deploying your application.

Tuning your application

Reduce the size of your XML documents

With XML, it is very easy to add data to an existing document and then extract just the data you need from that document. This creates a tendency for the size of the documents manipulated by your application to grow as development progresses. Who has never said "let's just add this information to this existing document", or "let's keep this information in the document and pass it around; you never know, we might need it in the future"? While this might be just fine in some cases, you need to make sure that the size of your documents does not increase to the point where performance is impacted. If you uncover a performance issue, you should check the size of the documents you manipulate and reduce it when possible.

If you need to be further convinced, consider an application where pages are generated based on some information contained in an XML schema. This XML schema is stored in an XML database and takes about 100 KB or 4000 lines when serialized.
Because data contained in the file is needed in multiple locations, the file is passed around in a number of pipelines while generating a page, and is used overall as input to 10 processors. Each processor will create its own representation of the data in memory, which can take 10 times the size of the serialized XML. That means that each processor has to allocate 1 MB of objects and do some processing on those objects. At the end of the request, 10 MB of memory have been allocated to process this data, and the garbage collector will eventually have to spend CPU cycles on freeing this memory. What if, out of the 4000 lines, only 400 are actually used? Starting by extracting those 400 lines and then passing only those to the processors means that the processors now need to do only one tenth of the work they were doing before. Clearly this type of modification can drastically improve performance.

If you are required to work with large documents, also consider using an XML database such as the open source eXist database, and delegate complex queries to the database: this should be more efficient than continually retrieving large XML documents and processing them in Orbeon Forms.

Tune your XSLT code

Some operations in XSLT or XPath can be very expensive. This is in particular the case for XPath expressions where the engine has to go through the whole input document to evaluate the expression. Consider this XPath expression: //person. To evaluate it, the engine iterates over every element in the document looking for a <person>. If you know that, given the structure of the document, a person is inside a department, which is in a company, you can rewrite this as /company/department/person, which will typically run more efficiently.

If you can, also try to avoid running many XSLT transformations. In particular, you may be able to avoid running a theme stylesheet entirely. See also Customize the standard epilogue.
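The difference between the two XPath expressions above can be illustrated outside of XSLT. The sketch below uses Python's standard xml.etree.ElementTree module, which is not the XPath engine used by Orbeon Forms, on a small hypothetical document. The path .//person plays the role of //person and ./department/person plays the role of /company/department/person; both select the same elements here, but the second only visits the two levels known to contain them.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every person sits inside a department inside the company.
DOC = """
<company>
  <department name="sales">
    <person>Ada</person>
    <person>Grace</person>
  </department>
  <department name="support">
    <person>Linus</person>
  </department>
</company>
"""

root = ET.fromstring(DOC)  # root is the <company> element

# Counterpart of //person: considers every descendant of the root.
everywhere = [p.text for p in root.findall(".//person")]

# Counterpart of /company/department/person: only follows child steps.
targeted = [p.text for p in root.findall("./department/person")]

print(everywhere)
print(targeted)
```

The two lists are only guaranteed to match when the document structure is known in advance, which is the condition stated above for rewriting the expression.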
Enable XPL profiling

XPL profiling was introduced with Orbeon Forms 3.0 to give you detailed information on how much time is being spent in each processor involved in generating a web page. When enabled, for each HTTP request, the XPL profiler will output a tree with all the processor calls. Each node of the tree represents the execution of a processor. When a processor starts other processors, those are represented as child nodes in the tree. With each node, the profiler outputs 2 numbers: the first is the time spent specifically by this processor; the second is the time spent cumulatively by this processor and all its children.

By default, XPL profiling is disabled. To enable XPL profiling, configure processor.trace, processor.trace.host, and processor.trace.port as described in the properties documentation.

Customize the standard epilogue

The standard Orbeon Forms epilogue can be optimized for your own needs. For example:
    <property as="xs:boolean" name="oxf.epilogue.use-theme" value="false"/>

Make sure you remove all your XPL debug attributes

While using debug attributes is one of the best ways to debug XPL pipelines, they also have an impact on performance, as they locally disable XPL caching and also require time to serialize XML documents to your logger. For performance testing and production, always remove all the debug attributes.

Don't serve your static files through Orbeon Forms

It is overkill to serve static files such as static images through Orbeon Forms. Instead, use your servlet container's facilities for serving static files, or even better, use a simple web server such as the Apache HTTP Server.

Delay expensive submissions

There are times when you need to perform an expensive call to your backend to load data which is shown on your form. Typically, you do this by running an <xforms:submission> on xforms-model-construct-done. If running that submission is really expensive (say, taking seconds), you might want to consider serving your form to the browser without that data, and loading it through Ajax as soon as the form is displayed in the browser. In essence, you will:
    <xforms:action ev:event="my-load-initial-data">
        <xforms:send submission="my-expensive-submission"/>
    </xforms:action>

Use xforms:instance to load dynamic instances

There are several ways of initializing XForms instances. For instances whose content is generated dynamically, run a submission on xforms-model-construct-done to load the instance instead of using XSLT or XInclude. This helps ensure that the source XForms page is cacheable.

Perform initialization on xforms-model-construct-done

If you have initialization tasks to perform upon page initialization, try using xforms-model-construct-done instead of xforms-ready. This will cause fewer updates to controls and may yield better performance. See also the XForms reference for details.

Tuning the Orbeon XForms engine

Enable minimal, combined, versioned resources
For more information on this, see XForms - JavaScript and CSS Resources.

Consider using read-only and cached instances

Some XForms instances that never change or that are simply replaced during a submission can be set as read-only, and can also optionally be cached between pages and even applications. Such instances take less memory and are more efficient to build. However they cannot hold data that changes over time and they cannot use XForms Model Item Properties (MIPs).

For more information on this, see XForms - Performance settings.

Consider using asynchronous submissions

[TODO]

Consider not refreshing sets of items on selection controls

The xxforms:refresh-items attribute on selection controls allows for performance improvements in certain cases. See Controlling Item Sets Refreshes with xxforms:refresh-items.

For more information on this, see XForms - Performance settings.

Use server-side XForms state handling

The XForms engine state can be either stored on the server, or sent to the client and exchanged with the server with each request. By default, the XForms state is stored on the server. Make sure this is enabled.

Tune XForms caches

[TODO]

Control Ajax updates

Check whether your XForms document is cacheable

The XForms engine performs some analysis (called static analysis) on XForms pages before rendering them. This includes:
If the analysis can be stored in cache, performance is typically enhanced. Therefore it is important to ensure that caching occurs. Here is how you can check whether a given XForms document can cache its static analysis:
NOTE: If your XForms page is generated from XSLT or JSP, and you insert changing inline instance data into it, then it is likely not cacheable.

It is a good idea to also test the scenarios above with multiple users (if your application handles multiple users), in order to make sure that a change of user keeps caching active.

Making your XForms document cacheable

The key to making the XForms document cacheable is to make sure the document that is fed to the XForms processor "doesn't change" between requests. In Orbeon Forms, "change" usually means making changes to a file or one of its dependencies. The absolute safest way to make the document cacheable is to keep it as a single static file on disk! But this is not the only way. The document will also be cacheable if it depends on other documents which themselves are cacheable. For example, XSLT and XInclude transformations support this and in general allow caching of their resulting document, if the XSLT or XInclude document doesn't change and if all their dependencies don't change either.

NOTE: There are exceptions, like using the doc() or document() functions in XSLT with a dynamic parameter. In this case, caching is not possible.

More generally, content that is provided with key/validity information that doesn't change (for example, the oxf:scope-generator processor) is also cacheable.

The general rule is the following:
For example:
There are cases where the above can be done (for example if the data is made cacheable and rarely changes), but you have to be very careful about it. For dynamic aspects in XForms, you can instead use instance/@src, submissions, functions like xxforms:get-request-parameter(), etc.

NOTE: As of 2010-08, this behavior is improved: the static analysis is cached based on a digest of XForms elements. This means that, even if the entire input document is not cacheable, the static state analysis might still be cacheable. See how to check log files above to check which kind of caching occurs.

See also Load initial form data.

[TODO: explain how to load instance data efficiently]

Other recommendations

Use a performance analysis tool

To obtain your numbers, use a tool such as Apache JMeter. Be sure to warm up your Java VM first and to let the tool run for a significant number of samples before recording your performance numbers.

If you feel comfortable with the source code of Orbeon Forms, you can also use a Java profiler such as YourKit to figure out if a particular part of the Orbeon Forms platform is a bottleneck.

Make sure of what you are measuring

If you are testing the performance of an application that talks to a database or backend services, make sure you can determine how much time is spent in your front-end or presentation layer and how much in your backend and data layers.
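One way to separate the two is to time the backend call inside the overall request timing and subtract. The following is a minimal Python sketch of that idea, not an Orbeon Forms facility: handle_request, fake_backend_call, and the 10 ms sleep are hypothetical stand-ins for your own request handling and backend access. It also discards a few warm-up iterations before measuring, in the spirit of the warm-up advice above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per layer name.
timings = defaultdict(float)


@contextmanager
def timed(layer):
    """Add the time spent inside the with-block to timings[layer]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[layer] += time.perf_counter() - start


def fake_backend_call():
    # Hypothetical stand-in for a database or service call.
    time.sleep(0.01)


def handle_request():
    # Hypothetical stand-in for one request: backend call plus presentation work.
    with timed("total"):
        with timed("backend"):
            fake_backend_call()
        sum(i * i for i in range(10000))  # stand-in for front-end processing


def measure(samples, warmup):
    """Run warm-up requests, discard their timings, then time `samples` requests."""
    for _ in range(warmup):
        handle_request()
    timings.clear()
    for _ in range(samples):
        handle_request()
    total, backend = timings["total"], timings["backend"]
    return {"total": total, "backend": backend, "frontend": total - backend}


result = measure(samples=20, warmup=5)
print({layer: round(seconds, 3) for layer, seconds in result.items()})
```

In a real test you would take the equivalent timings from your load-testing tool and from your backend's own logs, rather than from in-process timers like these.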