Scale test report for very large scale document repositories (SharePoint Server 2010)

 

Applies to: SharePoint Server 2010

This article describes the results of large scale testing of Microsoft SharePoint Server 2010 that was performed at Microsoft. The results of the testing were used to publish requirements for scaling document archive repositories on SharePoint Server to a large storage capacity. The tests involved creating a large number of typical documents that have an average size of 256 KB, loading the documents into a SharePoint Server farm, creating a Microsoft FAST Search Server 2010 for SharePoint index on the documents, and then running tests with Microsoft Visual Studio 2010 Ultimate to simulate usage.

This testing demonstrates both scale-up and scale-out techniques. Scale-up refers to using additional hardware capacity to increase resources of a single environment, which for these purposes means a SharePoint Server content database. A SharePoint Server content database contains all site collections, all metadata, and binary large objects (BLOBs) associated with those site collections that are accessed by SharePoint Server. Scale-out refers to having multiple environments, which for these purposes means using multiple SharePoint Server content databases. A content database consists of a SQL Server database, various configuration data, and any document BLOBs regardless of where the BLOBs are stored.

The workload that was tested for this report is primarily about document archive. This includes a large number of typical Microsoft Office documents that are stored for archival purposes. Storage in this scenario is typically for the long term with infrequent access.

In this article:

  • Test farm characteristics

  • Test Farm Hardware Architecture Details

  • Test Farm SharePoint Server and SQL Server Architecture

  • The Method, Project Timeline and Process for Building the Farm

  • Results from Testing

  • Conclusions

  • Recommendations

Test farm characteristics

This section describes the dataset, workloads, hardware settings, topology, and test definitions that were used during the large scale SharePoint Server testing.

Definition of tested workload

This load test was developed to show the large document archive capabilities of SharePoint Server 2010. The document archive workload is characterized by a large number of documents that are added, or ingested, slowly; the documents are infrequently accessed and almost never updated.

Working with Large Document Archives

Large document archive capabilities

Document archive scale-out architecture

Content routing is recommended for a SharePoint Server 2010 farm that has multiple content databases. Content routing sends documents from an initial drop library to the correct content database. In the tests that are described in this report, content routing was not configured; the focus of the tests was the scalability and performance of the installation.

While content routing is used to ingest documents into one of multiple SharePoint Server 2010 content databases, Microsoft FAST Search Server 2010 for SharePoint can be used to locate a document across one or more content databases. Microsoft FAST Search Server 2010 for SharePoint builds an index of all documents from all content databases. Searches can use metadata and refiners to select by date, author, or other properties, and full-text searches can also be performed.
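The following is a minimal sketch of the kind of query this enables, written in C# against the SharePoint Server 2010 server-side query object model (KeywordQuery) with the FAST results provider. The site URL, query text, and refiner property names are placeholders and are not part of the tested configuration.

```
using System;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Search.Query;

class FastQuerySample
{
    static void Main()
    {
        // Placeholder URL for the FAST Search Center site collection.
        using (SPSite site = new SPSite("http://fastsearchcenter"))
        {
            KeywordQuery query = new KeywordQuery(site);
            query.ResultsProvider = SearchProvider.FASTSearch;  // send the query to FAST
            query.QueryText = "contract renewal";               // full-text terms
            query.ResultTypes = ResultType.RelevantResults;
            query.RowLimit = 50;

            // Ask FAST to compute refiners; the managed property names are assumptions.
            query.Refiners = "Author,Write,Format";

            ResultTableCollection results = query.Execute();
            ResultTable relevant = results[ResultType.RelevantResults];
            Console.WriteLine("Total hits: {0}", relevant.TotalRows);
        }
    }
}
```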

Workload

This article presents the results of a series of performance tests that were conducted on SharePoint Server 2010 and Microsoft FAST Search Server 2010 for SharePoint in a document archive scenario. This section includes an explanation of the testing methodology that was used for tests. Deviations from this methodology are noted where data is presented.

Important

The specific capacity and performance figures presented in this article differ from the figures in real-world environments. The figures that are presented are intended to provide a starting point for the design of an appropriately scaled environment. After you complete your initial system design, test the configuration to determine whether your system will support the factors in your environment.

Testing workloads were designed according to a large document archive storage scenario and are intended to help develop estimates of how different farm configurations are affected by using a large-scale document repository.

The test farm depicted in this scenario was built to allow scale out and scale up to accommodate additional capacity as required.

The ability to scale out or scale up is as important for small-scale implementations as it is for large-scale document archive scenarios. Scaling out lets you add more servers to your farm (or farms), such as additional front-end Web servers or application servers. Scaling up increases the capacity of your existing servers by adding faster CPUs or more memory, or both, to increase throughput and performance. Content routing should also be leveraged in archive scenarios so that users can drop a file and have it dynamically routed to the proper document library and folder, if applicable, based on the metadata of the file.
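As a rough illustration of that routing step, the sketch below creates a content organizer rule by using the SharePoint Server 2010 records management API; the EcmDocumentRouterRule class and its members are assumed from that API, and the site URL, content type, condition XML, and target library are placeholders. Content routing of this kind was not configured in the tested farm.

```
using Microsoft.SharePoint;
using Microsoft.Office.RecordsManagement.RecordsRepository;

class RoutingRuleSample
{
    static void Main()
    {
        // Placeholder URL for a Document Center where the content organizer is enabled.
        using (SPSite site = new SPSite("http://documentcenter1"))
        using (SPWeb web = site.OpenWeb())
        {
            // Route documents dropped in the Drop Off Library to a target library.
            // The condition XML and target library name are illustrative only.
            EcmDocumentRouterRule rule = new EcmDocumentRouterRule(web);
            rule.Name = "Route archive documents";
            rule.ContentTypeString = "Document";
            rule.ConditionsString = "<Conditions />";   // match every document of this content type
            rule.TargetPath = web.Lists["Archive Library 1"].RootFolder.ServerRelativeUrl;
            rule.RouteToExternalLocation = false;
            rule.Priority = "5";
            rule.Enabled = true;
            rule.Update();
        }
    }
}
```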

Test transaction definitions and baseline settings

This section defines the test transactions and other baseline settings, and provides an overview of the test process that was used for each scenario. Detailed information such as test results and specific parameters are given in later sections in this article.

Baseline Item | Baseline Item Description | Baseline Setting (or Transaction Percent)
Document Upload | Upload a document to one of the Document Centers. One unique folder and file were created in each Document Center each hour, 24 hours a day. | 1%
Document Download (Open) | Download or open a document. | 30%
Browse | Access of a random Document Center home page, document library list view page, or folder list view page. | 40%
Search | A random search query submitted to the FAST Search Center. | 30%
Think Time | The time between transactions for each user. This represents the time that a user spends reading or thinking between accesses to Web pages. | 10 seconds
Concurrent Users | The number of users who are connecting to the SharePoint Server farm from test agents to the SharePoint Server front-end Web servers. This does not represent a possible total user base, because in a typical environment a small proportion of total users will concurrently access the system. | 10,000
Test Duration | The length of time that the test is run. | 1 hour
Web Caching | Whether Web caching is turned on for the front-end Web servers. | On
FAST Content Indexing | Whether FAST content indexing is operating during the test. | Paused
Number of WFEs | The number of front-end Web servers in the SharePoint Server farm that were used during the test. | 3 per content database
User Ramp | Each test was started with 1,000 users and ramped to the target user load in increments of 100 users. A 30 second ramp time was used and a 10 second step time. | 100 users per 30 seconds
Test Agents | Visual Studio 2010 Ultimate was used to simulate the user transaction load. One test controller virtual machine and 19 test agent virtual machines were used to create this load. | 19

Test baseline mix

This section defines the test mixes that were used and provides an overview of the test results for each test mix scenario.

The test mix that was used for each test varied, based on the particular test and load targets. All tests were conducted by using Visual Studio 2010 Ultimate and were recorded to code-free scripts that were generated exclusively by Visual Studio 2010. Specific data points for each test were populated, and then the test mix was run for different periods using different numbers of concurrent users to determine farm capacities and limits.

Note

All tests that were conducted in the lab were run using 10 seconds of think time. Think time is a feature of the Visual Studio 2010 Ultimate Test Controller that simulates the time that users pause between clicks on a page in a real-world environment.
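The tests in the lab were recorded as code-free (declarative) web tests, but the equivalent coded web test below shows where the 10-second think time sits. This is only an illustrative C# sketch against the Visual Studio web testing API, and the URL is a placeholder.

```
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.WebTesting;

// Coded equivalent of one "Browse" transaction. The URL is a placeholder.
public class BrowseDocumentCenter : WebTest
{
    public override IEnumerator<WebTestRequest> GetRequestEnumerator()
    {
        WebTestRequest home = new WebTestRequest("http://documentcenter1/SitePages/Home.aspx");
        home.ThinkTime = 10;   // seconds the simulated user pauses before the next request
        yield return home;
    }
}
```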

Note

The mix of operations that was used to measure performance for this article is artificial. All results are only intended to illustrate performance characteristics in a controlled environment under a specific set of conditions. These test mixes include an uncharacteristically high proportion of list queries, which consume a large amount of SQL Server resources compared to other operations. This is intended to provide a starting point for the design of an appropriately scaled environment. After you have completed your initial system design, test the configuration to determine whether your system will support your specific environmental variables and mix of operations.

Test Series A – Vary Users

This test series varies the number of users to see how increased user load affects the system resources in the SharePoint Server and FAST Search Server 2010 for SharePoint farm. Three tests were performed, with 4,000 users, 10,000 users, and 15,000 users. For the 15,000 user test, the test time was increased to 2 hours to accommodate the longer user ramp, and the number of front-end Web servers (WFEs) was increased to 6 to handle the additional load.

Test | Number of users | Number of WFEs | Test Time
A.1 | 4,000 | 3 | 1 hour
A.2 | 10,000 | 3 | 1 hour (baseline)
A.3 | 15,000 | 6 | 2 hours
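The 2-hour duration for test A.3 follows directly from the ramp settings described earlier (start at 1,000 users, add 100 users every 30 seconds). The small calculation below is a sketch using those assumptions; it shows that 10,000 users are reached after 45 minutes, while 15,000 users need 70 minutes of ramp before any time is spent at full load.

```
using System;

class RampTime
{
    // Time to reach a target user load when starting at 1,000 users and
    // adding 100 users every 30 seconds (the baseline ramp settings).
    static TimeSpan TimeToFullLoad(int targetUsers)
    {
        int steps = (targetUsers - 1000) / 100;
        return TimeSpan.FromSeconds(steps * 30);
    }

    static void Main()
    {
        Console.WriteLine(TimeToFullLoad(10000));   // 00:45:00 - fits within a 1 hour test
        Console.WriteLine(TimeToFullLoad(15000));   // 01:10:00 - why test A.3 ran for 2 hours
    }
}
```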

Test Series B – Vary SQL Server RAM

This test series varies the RAM available to Microsoft SQL Server to compare the performance of a SQL Server computer that has a large amount of physical RAM with that of SQL Server computers that have less RAM. Six tests were performed with the maximum SQL Server RAM set to 16 GB, 32 GB, 64 GB, 128 GB, 256 GB, and 600 GB.

Test | SQL Server RAM
B.1 | 16 GB
B.2 | 32 GB
B.3 | 64 GB
B.4 | 128 GB
B.5 | 256 GB
B.6 | 600 GB (baseline)

Test Series C – Vary Search Mix

This test series varies the proportion of searching done by the test users compared to browsing and opening documents. The test workload that is applied to the farm is a mix of different user transactions, which by default follow the baseline of 30% for Open, 40% for Browse, and 30% for Search. Tests in this series vary the proportion of search and therefore also change the proportion of Open and Browse.

Test | Open | Browse | Search
C.1 | 30% | 55% | 15%
C.2 | 30% | 40% | 30% (baseline)
C.3 | 20% | 40% | 40%
C.4 | 20% | 30% | 50%
C.5 | 25% | 25% | 50%
C.6 | 5% | 20% | 75%

Test Series D – Vary WFE RAM

This test series varies the RAM allocated to the four front-end Web servers that were used for this test. The RAM on the four front-end Web servers was tested at 4 GB, 6 GB, 8 GB and 16 GB.

Test | WFE Memory
D.1 | 4 GB
D.2 | 6 GB
D.3 | 8 GB (baseline)
D.4 | 16 GB

Test Series E – Vary number of WFEs

This test series varies the number of front-end Web servers that are being used. The different number of servers tested was 2, 3, 4, 5, and 6.

Test | Number of front-end Web servers
E.1 | 2
E.2 | 3 (baseline)
E.3 | 4
E.4 | 5
E.5 | 6

Test Series F – SQL Server CPU Restrictions

This test series restricts the number of CPUs available to SQL Server. The numbers of CPUs available to SQL Server that were tested were 4, 6, 8, 16, and 80.

Test | CPUs Available to SQL Server
F.1 | 4
F.2 | 6
F.3 | 8
F.4 | 16
F.5 | 80 (baseline)

Test load

Baseline tests were intended to stay below an optimal load point, or Green Zone, with a general mix of operations, and tests were conducted at each point that a variable was altered in order to measure particular changes. Some test series were deliberately designed to exceed the optimal load point in order to find resource bottlenecks in the farm configuration. It is recommended that optimal load point results be used for provisioning production farms so that there is excess resource capacity to handle transitory, unexpected loads. For this project we defined the optimal load point as keeping resources below the following thresholds:

  • 75th percentile latency is less than 1 second

  • Front-end Web server CPU is less than 85%

  • SQL Server CPU is less than 50%

  • Application server CPU is less than 50%

  • FAST Search Server 2010 for SharePoint CPU is less than 50%

  • Failure rate is less than 0.01
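A trivial way to express these thresholds is a pass/fail check over the averaged counters for a run, as sketched below in C#; the parameter names are illustrative and are not part of the test harness that was used.

```
class GreenZoneCheck
{
    // Evaluates averaged measurements for a test run against the thresholds above.
    public static bool IsWithinGreenZone(
        double latency75thSeconds,
        double wfeCpuPercent,
        double sqlCpuPercent,
        double appCpuPercent,
        double fastCpuPercent,
        double failureRate)
    {
        return latency75thSeconds < 1.0
            && wfeCpuPercent < 85.0
            && sqlCpuPercent < 50.0
            && appCpuPercent < 50.0
            && fastCpuPercent < 50.0
            && failureRate < 0.01;
    }
}
```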

Resource capture during tests

During each test run, resource usage was captured by using Performance Monitor (Perfmon.exe) and Visual Studio 2010 Ultimate in order to determine the load on the test farm. The following details were captured and are shown in the reports section.

  • CPU for each WFE, SharePoint Server application server, FAST Search Server 2010 for SharePoint Index, Fast Search Service Application (SSA), SQL Server computer

  • RAM usage for each WFE, SharePoint Server application server, FAST Search Server 2010 for SharePoint Index, Fast SSA, SQL Server computer

  • Page refresh time on all test elements

  • Disk queues for each drive
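In the lab these counters were collected with Performance Monitor and the Visual Studio 2010 test controller. The sketch below shows an equivalent way to sample a few of them from C# code with System.Diagnostics; the machine name is a placeholder for one of the farm servers.

```
using System;
using System.Diagnostics;
using System.Threading;

class CounterSample
{
    static void Main()
    {
        // Counters equivalent to a few of those captured in the lab; "WFE-1" is a
        // placeholder machine name.
        var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total", "WFE-1");
        var ram = new PerformanceCounter("Memory", "Available MBytes", string.Empty, "WFE-1");
        var disk = new PerformanceCounter("PhysicalDisk", "Avg. Disk Queue Length", "_Total", "WFE-1");

        for (int i = 0; i < 10; i++)
        {
            Console.WriteLine("CPU {0:F1}%  Available RAM {1} MB  Disk queue {2:F2}",
                cpu.NextValue(), ram.NextValue(), disk.NextValue());
            Thread.Sleep(1000);   // sample once per second
        }
    }
}
```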

Test Farm Hardware Architecture Details

The Document Center farm is the host for SharePoint Server Central Administration, Document Center 1, Document Center 2, Service Applications and the integrated FAST Search Center. The farm consists of three physical servers and 22 virtual servers.

The following figure shows the physical architecture of the test farm.

Hardware Architecture

Physical architecture of the Document Center farm

The following figure focuses on the physical servers in the Document Center farm.

Physical Servers

Physical Servers in the Document Server farm

Hyper-threading was disabled on the physical servers because any one Hyper-V virtual machine was limited to 4 logical CPUs and the additional logical processors were not needed; the goal was to avoid any decrease in performance of these servers caused by hyper-threading. There were three physical servers in the lab. All three physical servers plus the 22 virtual servers were connected to a virtual LAN within the lab to isolate their network traffic from other, unrelated lab computers. The LAN was hosted by a 1 Gbps Ethernet switch, and each NEC server was connected to two 1 Gbps Ethernet ports.

SPDC01

The Windows Domain Controller and Domain Naming System (DNS) Server for the virtual network that was used in the lab.

4 physical processor cores running at 3.4 GHz

4 GB of RAM

33 GB RAID SCSI Local Disk Device

PACNEC01

The Microsoft SQL Server 2008 R2 server that hosts the primary and secondary files for the content databases, the logs, and TempDB. 100 FAST document processors were also run directly on this server.

NEC ExpressServer 5800 1080a

8 Intel E7-8870 CPUs containing 80 physical processor cores running at 2.4 GHz

1 TB of RAM

800 GB of Direct Attached Disk

2x Dual Port Fiber Channel Host Bus Adapter cards capable of 8 Gbps

2x 1 Gbps Ethernet cards

PACNEC02

The Hyper-V Host that serves the SharePoint Server, FAST Search for SharePoint and Test Rig virtual machines within the farm.

NEC ExpressServer 5800 1080a

8 Intel X7560 CPUs containing a total of 64 physical processor cores running at 2.27 GHz

1 TB of RAM

800 GB of Direct Attached Disk

2x Dual Port Fiber Channel Host Bus Adapter cards capable of 8 Gbps

2x 1 Gbps Ethernet cards

Hardware provided by manufacturers

This test was made possible by the support from Microsoft hardware partners.

NEC Corporation of America

NEC provided an NEC Express5800/A1080a (GX) server that contains 8 CPUs (processors) and 1 terabyte of total RAM. Each processor contains 8 cores for a total of 64 cores for the server. As detailed below, this server was used to run Microsoft Hyper-V with several virtual machines that made up the SharePoint Server and FAST Search Server 2010 for SharePoint farms.

Source: www.necam.com/servers/enterprise

Specifications for NEC Express 5800/A1080a server:

  • 8x Westmere CPU (E7-8870) each with 10 processor cores

  • 1 terabyte memory. Each Processor Memory Module has 1 CPU (10 cores) and 16 DIMMs.

  • 2x dual port 8G FC HBA

  • 5 HDDs

Intel

Intel provided a second NEC Express5800/A1080a server that also contained 8 CPUs (processors) and 1 terabyte of RAM. Intel upgraded this computer to Westmere EX CPUs, each of which contains 10 cores, for a total of 80 cores for the server. As detailed below, this server was used to run Microsoft SQL Server and the FAST Search Server 2010 for SharePoint indexers directly on the computer without using Hyper-V.

EMC

EMC provided an EMC VNX 5700 SAN that contains 300 terabytes of high-performance disk.

Source: http://www.emc.com/collateral/software/15-min-guide/h8527-vnx-virt-msapp-t10.pdf

Specifications for EMC VNX 5700:

  • 2 TB drives, 15 per 3U DAE, 5 units = total 75 drives, 150 terabyte raw storage

  • 600 GB drives, 25 per 2U DAE, 10 units = total 250 drives, 150 terabyte raw storage

  • 2x Storage Processors

  • 2x Backup Battery Units

Virtual Servers

These servers all ran on the Hyper-V instance on PACNEC02. All virtual servers started from VHD files that are stored locally on the PACNEC02 server, and all had configured access to the lab virtual LAN. Some of these virtual servers were given direct disk access within the guest operating system to a LUN on the SAN. This direct (pass-through) disk access performs better than a VHD disk and was used for accessing the FAST Search indexes.

Virtual Servers

Virtual Servers in the Document Center farm

The following is a list of the different types of virtual servers that run in the lab and the details of their resources use and services that are provided.

Virtual Server Type Description

Test Rigs (TestRig-1 through TestRig-20)

  • TestRig-1 is the Test Controller from Visual Studio 2010 Ultimate

  • TestRig-2 through TestRig-20 are the Test Agents from Visual Studio Agents 2010 that are controlled by TestRig-1

The Test Controller and Test Agents from Visual Studio 2010 Ultimate for load testing the farm. These virtual servers were configured to use 4 virtual processors and 8 GB memory. These servers used VHD for disk.

SP: Central Admin, Secure Store SA’s, Crawler

  • APP-1 - SharePoint Server Central Administration Host and FAST Search Service Application Host.

  • APP-2 - SharePoint Service Applications and FAST Search Service Application Host. This application server ran the following SharePoint Shared Service Applications:

    • Secure Store Service Application.

    • FAST Search Service Application.

These virtual machines host the SharePoint Server Central Administration site and the service applications that are used in the farm. These virtual servers were configured with 4 virtual processors and 16 GB memory. These servers used VHD for disk.

FAST Service and Administration

  • FAST-SSA-1 and FAST-SSA-2 – FAST Search Service Applications 1 and 2 respectively.

These virtual machines host the FAST Search Service and Administration. Each virtual machine was configured with 4 virtual processors and 16 GB memory. These servers used VHD for disk.

FAST Index-Search

  • FAST-IS-1, FAST-IS2, FAST-IS3, and FAST-IS4 - FAST Index, Search, Web Analyzer Nodes 1, 2, 3, and 4.

These virtual machines host the FAST index and the search and Web Analyzer nodes that are used in the farm. These servers were configured to use 4 virtual processors and 16 GB memory, and they used VHD for their boot disk. These servers had direct (pass-through) access to 3 terabytes of SAN LUNs for storage of the FAST index.

Front-end Web server (SharePoint Server and FAST Search)

  • WFE-1, WFE-2, and WFE-3 - Front-end Web server #1, #2, and #3, part of a Windows load-balancing configuration hosting the first Document Center. These virtual servers were configured to use 4 virtual processors and 8 GB memory.

  • WFE-4, WFE-5, and WFE-6 - Front-end Web server #4, #5, and #6, part of a Windows load-balancing configuration hosting the second Document Center. These virtual servers were configured to use 4 virtual processors and 8 GB memory.

These virtual machines host all of the front-end Web servers and a dedicated FAST crawler host within the farm. Each content database contained one Document Center site, which was configured to use 3 load-balanced SharePoint Server WFEs. This was to facilitate the test mix for load testing across the two content databases; in a real farm, each WFE would target multiple content databases. These servers used VHD for disk.

Disk storage

The storage consists of EMC VNX5700 Unified Storage. The VNX5700 array was connected to each of the physical servers, PACNEC01 and PACNEC02, with 8 Gbps Fiber Channel. Each of these physical servers contains two Fiber Channel host bus adapters so that it can connect to both of the Storage Processors on the primary SAN, which provides redundancy and allows the SAN to balance LUNs across the Storage Processors.

Storage Area Network - EMC VNX5700 Array

An EMC VNX5700 array (http://www.emc.com/products/series/vnx-series.htm#/1) was used for storage of the SQL Server databases and the FAST Search Server 2010 for SharePoint search index. The VNX5700 was configured with 300 terabytes of raw disk. The array was populated with 250x 600 GB 10,000 RPM SAS drives and 75x 2 TB 7,200 RPM near-line SAS drives (near-line SAS drives have SATA physical interfaces with SAS connectors, whereas the regular SAS drives have SCSI physical interfaces). The drives were configured in a RAID-10 format for mirroring and striping. The configured RAID volume in the Storage Area Network (SAN) was split across 3 pools, and LUNs were allocated from a specific pool as shown in the following table.

Pool # | Description | Drive Type | User Capacity (GB) | Allocated (GB)
0 | FAST | SAS | 31,967 | 24,735
1 | Content DB | SAS | 34,631 | 34,081
2 | Spare – not used | NL SAS | 58,586 | 5,261

The Logical Unit Numbers (LUNs) on the VNX 5700 were defined as shown in the following table.

LUN # | Description | Size (GB) | Server | Disk Pool # | Drive Letter
0 | SP Service DB | 1,024 | PACNEC01 | 0 | F
1 | PACNEC02 additional space | 5,120 | PACNEC02 | 0 | (none)
2 | FAST Index 1 | 3,072 | PACNEC02 | 0 | F
3 | FAST Index 2 | 3,072 | PACNEC02 | 0 | G
4 | FAST Index 3 | 3,072 | PACNEC02 | 0 | H
5 | FAST Index 4 | 3,072 | PACNEC02 | 0 | I
6 | SP Content DB 1 | 7,500 | PACNEC01 | 1 | H
7 | SP Content DB 2 | 6,850 | PACNEC01 | 1 | I
8 | SP Content DB 3 | 6,850 | PACNEC01 | 1 | J
9 | SP Content DB 4 | 6,850 | PACNEC01 | 1 | K
10 | SP Content DB TransLog | 2,048 | PACNEC01 | 1 | G
11 | SP Service DB TransLog | 512 | PACNEC01 | 0 | L
12 | Temp DB | 2,048 | PACNEC01 | 1 | M
13 | Temp DB Log | 2,048 | PACNEC01 | 0 | N
14 | SP Usage Health DB | 3,072 | PACNEC01 | 0 | O
15 | FAST Crawl DB / Admin DB | 1,024 | PACNEC01 | 1 | P
16 | Spare – not used | 5,120 | PACNEC01 | 2 | (none)
17 | Bulk Office Doc Content | 3,072 | PACNEC01 | Additional | T
18 | VMs Swap Files | 1,024 | PACNEC02 | Additional | K
19 | DB Backup 1 | 16,384 | PACNEC01 | Additional | R
20 | DB Backup 2 | 16,384 | PACNEC01 | Additional | S

Storage Area Network - Additional Disk Array

An additional lower performance disk array was used for backup and to host the bulk Office document content that was loaded into the SharePoint Server 2010 farm. This array was not used during test runs.

Test Farm SharePoint Server and SQL Server Architecture

The logical architecture was defined to demonstrate the recommended limits of SharePoint Server 2010. The architecture consists of two Web applications that each contain a single site collection in a single unique content database. Each content database was loaded with 60 million documents of type Microsoft Word (.docx), Excel (.xlsx), PowerPoint (.pptx) and Hyper-text Markup Language (.html) pages. The average size of the documents was 250 kilobytes (KB). Content database size was approximately 15 TB each, for a total corpus of 30 TB. The logical architecture for the large-scale lab is shown in the following figure.

Logical architecture for the large-scale lab

The logical architecture for the large-scale lab

The SharePoint Server Document Center farm is intended to be used in a document archival scenario to accommodate a large number of documents stored in several document libraries. Document libraries were limited to approximately one million documents each, and a folder hierarchy limited the documents per container to approximately 2,000 items. This structure accommodates the large document loading process and prevents load performance from degrading after a document library exceeds 1 million items.
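A load process can keep within those limits by computing the target library and folder from the running document count. The C# sketch below shows one such partitioning scheme under the stated limits; the naming pattern is illustrative and is not the exact layout produced by the LoadBulk2SP tool.

```
class ArchivePartitioner
{
    const int DocsPerFolder = 2000;       // keep each folder below about 2,000 items
    const int DocsPerLibrary = 1000000;   // keep each document library near 1 million items

    // Returns an illustrative "Library/Folder" path for the Nth document (0-based).
    public static string PathForDocument(long docIndex)
    {
        long library = docIndex / DocsPerLibrary;
        long folder = (docIndex % DocsPerLibrary) / DocsPerFolder;
        return string.Format("DocLib{0:D3}/Folder{1:D4}", library + 1, folder + 1);
    }

    // Example: PathForDocument(0)       returns "DocLib001/Folder0001"
    // Example: PathForDocument(2500000) returns "DocLib003/Folder0251"
}
```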

SharePoint Server Farm IIS Web Sites

The two content site collections took advantage of the Document Center template. The Search Center site collection leveraged the FAST Search Center template. Each site collection was in a unique Web application. Each Web application used a separate application pool.

Web sites Description

IIS Web Site – SharePoint Services

The SharePoint Services IIS Web site hosts the shared services that are used in SharePoint Server 2010. For the purposes of this lab, the Secure Store Service was used.

IIS Web Site – SharePoint Central Administration v4

The Central Administration IIS Web Site hosts the Central Administration site and user interface for SharePoint Server 2010.

IIS Web Site – Document Center 1

The Document Center 1 IIS Web Site hosts the first Document Center archive.

IIS Web Site – Document Center 2

The Document Center 2 IIS Web Site hosts the second Document Center archive.

IIS Web Site – FAST Search Center

The Fast Search Center IIS Web Site hosts the search user interface for the farm.

At 70 million items and above the crawl database started to slow noticeably. Tuning work was required to take it from 100 million to 120 million items.

SQL Server Databases

The following SQL Server databases are hosted on the EMC VNX 5700 Storage Area Network (SAN).

DB Name Purpose Size (MB)

SharePointAdminContent_<GUID>

SharePoint Server Central Administration Database

768

SharePoint_Config

SharePoint Server Configuration Database

1,574

System Databases – tempdb

SQL Server Temporary Database

16,384

ReportServer

A Microsoft SQL Server database that stores all report metadata including report definitions, report history and snapshots, and scheduling information.

10

ReportServerTempDB

A Microsoft SQL Server database that stores all of the temporary snapshots while reports are running.

3

SPContent01 (Document Center 1 content database)

SharePoint Server content databases

15,601,286

SPContent02 (Document Center 2 content database)

SharePoint Server content databases

15,975,266

FAST_Query_CrawlStoreDB_<GUID>

Crawler store for the FAST Search Query Search Service Application. This crawl store database is used only for user profiles (People Search).

15

FAST_Query_DB_<GUID>

Administration database for the FAST Search Query Search Service Application.

125

FAST_Query_PropertyStoreDB_<GUID>

Stores the metadata properties and security descriptors for the user profile items in the people search index. It is involved in property-based people search queries and returns standard document attributes for people search query results.

173

FASTContent_CrawlStoreDB_<GUID>

Crawler store for the FAST Search Content Search Service Application. This crawl store database is used for all crawled items except user profiles.

502,481

FASTContent_DB_<GUID>

Administration database for the FAST Search Content Search Service Application.

23

FASTSearchAdminDatabase

Administration database for the FAST Search Server 2010 for SharePoint farm. Stores and manages search setting groups, keywords, synonyms, document and site promotions and demotions, property extractor inclusions and exclusions, spell check exclusions, best bets, visual best bets, and search schema metadata.

4

WSS_Content_FAST_Search

FAST Search Center content database.

52

LoadTest2010

Load test results repository

4,099

FAST Search Server 2010 for SharePoint Content Indexes

The FAST Search Server 2010 for SharePoint data directories use a Hyper-V pass through drive directly to the SAN.

On the virtual server FAST-IS1, the data directory uses 745 GB of the 3 terabytes, and no temp space is in use (temp items were cleaned up).

The following table shows the data storage in the FAST Search Server 2010 for SharePoint index file folders that are stored on the SAN.

Name | Purpose | Number of Files | Size (GB)
data_fixml | Index source used to create the index | 6 million | 223
data_index | Actual search index used by queries | 3,729 | 490
sprel | SharePoint Server relevancy information, which is used for boosting popular search results to the top of the list | 9 | 3
webanalyzer | Boosting search result order for more frequently linked documents | 135 | 12

The Method, Project Timeline and Process for Building the Farm

This is the approximate project timeline.

Task | Duration
Plan farm architecture | 2 weeks
Install server and SAN hardware | 1 week
Build virtual machines for farm | 1 week
Create sample content items | 2 weeks
Load items to SharePoint Server | 3 weeks
Develop test scripts | 1 week
FAST Search indexing of content | 2 weeks
Load testing | 3 weeks
Report writing | 2 weeks

How the sample documents were created

To provide a realistic document archive scenario, document uniqueness was very important. Two separate utilities were used: the first to create a large number of unique documents, and the second to read these files from disk and load them directly into targeted SharePoint Server Web Applications and document libraries.

Bulk Loader tool

Documents were created by using the command-line tool Bulk Loader, which was written using the Microsoft .NET Framework 4. This tool uses a dump file of Wikipedia content as input and enables a user to create up to 10 million unique documents at a disk location. Stock images are used to replace image references from the Wikipedia dumps. Bulk Loader is available as source code from https://code.msdn.microsoft.com/Bulk-Loader-Create-Unique-eeb2d084.

LoadBulk2SP tool

Documents were added to SharePoint Server by using the command-line tool LoadBulk2SP, which was written in C# using the Microsoft .NET Framework 3.5 to be compatible with SharePoint Server. LoadBulk2SP uses the disk output files from the Bulk Loader tool as input and mimics the same folder and file structure directly in SharePoint Server, using the target Web applications and document libraries that are specified in the application configuration. LoadBulk2SP was used to load over 100 million 250 KB documents into SharePoint Server, with a peak rate of 233 documents per second and an overall average rate of 137 documents per second (at the average rate, the full 120 million document corpus represents roughly 10 days of continuous loading time). LoadBulk2SP is available as source code from https://code.msdn.microsoft.com/Load-Bulk-Content-to-3f379974.

Performance characteristics for large-scale document load

Documents were loaded by using the LoadBulk2SP tool. This tool uses the SubFolderCollection.Add() method to add new folders to specified SharePoint Server document libraries, and the SPFileCollection.Add() method to add files directly into the document library folders. The folder and file structure that is created in SharePoint Server mimics the output hierarchy that is created by the Bulk Loader tool.
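The following is a minimal C# sketch of those two calls for a single folder and file; the site URL, library name, and file path are placeholders, and the batching, threading, and configuration handling of the real tool are omitted.

```
using System.IO;
using Microsoft.SharePoint;

class BulkAddSample
{
    static void Main()
    {
        using (SPSite site = new SPSite("http://documentcenter1"))
        using (SPWeb web = site.OpenWeb())
        {
            SPFolder libraryRoot = web.GetFolder("DocLib001");

            // Add a folder to the document library; SubFolders is an SPFolderCollection.
            SPFolder folder = libraryRoot.SubFolders.Add("Folder0001");

            // Add one file into that folder through SPFileCollection.Add.
            byte[] content = File.ReadAllBytes(@"C:\BulkLoaderOutput\Folder0001\Doc000001.docx");
            folder.Files.Add("Doc000001.docx", content, true /* overwrite */);
        }
    }
}
```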

Document Library content database sizes

The following table shows the size details of each document library content database, including SQL Server Filegroups, Primary and Secondary files that are used in the farm.

SQL Content File FileGroup LUN Size (KB) Size (MB) Size (GB) Size (TB)

SPCPrimary01.mdf

Primary

H:/

53,248

52.000

0.050

0.000

SPCData0102.mdf

SPCData01

I:/

3,942,098,048

3,849,697.312

3,759.470

3.671

SPCData0103.mdf

SPCData01

J:/

4,719,712,768

4,609,094.500

4,501.068

4.395

SPCData0104.mdf

SPCData01

K:/

3,723,746,048

3,636,470.750

3,551.240

3.468

SPCData0105.mdf

SPCData01

H:/

3,371,171,968

3,292,160.125

3,215.000

3.139

SPCData0106.mdf

SPCData01

O:/

4,194,394

4,096.087

4.000

0.003

Document Center 1

Totals:

15,760,968,474

15,391,570.775

15,030.820

14.678

SPCPrimary02.mdf

SPCData02

H:/

52,224

51.00

0.049

0.000

SPCData0202.mdf

SPCData02

I:/

3,240,200,064

3,164,257.875

3,090.095

3.017

SPCData0203.mdf

SPCData02

J:/

3,144,130,944

3,070,440.375

2,998.476

2.928

SPCData0204.mdf

SPCData02

K:/

3,458,544,064

3,377,484.437

3,298.324

3.221

SPCData0205.mdf

SPCData02

H:/

3,805,828,608

3,716,629.500

3,629.521

3.544

SPCData0206.mdf

SPCData02

O:/

2,495,168,448

2,436,687.937

2,379.578

2.323

Document Center 2

Totals:

16,143,924,352

15,765,551.125

15,396.046

15.035

Corpus Total:

31,904,892,826

31,157,121.900

30,426.876

29.713

Document Library hierarchies, folders and files

The following are the details of the document library hierarchies and the total number of folders and documents that were generated by the LoadBulk2SP tool for each Document Center. The totals across both Document Centers are 60,234 folders and 120,092,033 files.

Document Center 1

The total number of folders and files that are contained in each document library in the content database are shown in the following table. Documents were limited to 1 million per document library strictly for the purposes of a large content load process. For SharePoint Server farm architecture results and advice related to large document library storage, see Estimate performance and capacity requirements for large scale document repositories in SharePoint Server 2010, which focuses on the performance characteristics of document libraries as size increases and the throughput of high volume operations.

For more information about software boundaries and limits of items in document libraries and items in content databases, see SharePoint Server 2010 capacity management: Software boundaries and limits.

Document Center 1

Document Library | Folders | Files
DC1 totals: | 30,447 | 60,662,595

Document Center 2

The total number of folders and files that are contained in each document library in the content database are shown in the following table.

Document Center 2

Document Library | Folders | Files
DC2 totals: | 29,787 | 59,429,438
DC1 totals: | 30,447 | 60,662,595
Corpus totals: | 60,234 | 120,092,033

The following table shows statistical samples from the top five test runs of the LoadBulk2SP tool that used four concurrent processes, with each process using 16 threads that targeted different Document Centers, document libraries and input folders and files.

Run | Batch | Load Time (h:mm:ss) | Total Seconds | Folders | Files | Docs/Sec
Run 26 | 5 folders @ 2k files | 0:45:46 | 2,746 | 315 | 639,980 | 233
Run 9 | 30 folders @ 2k files | 5:58:46 | 21,526 | 1,920 | 3,839,864 | 178
Run 10 | 30 folders @ 2k files | 6:33:50 | 23,630 | 1,920 | 3,839,881 | 162
Run 8 | 30 folders @ 2k files | 6:51:30 | 24,690 | 1,920 | 3,839,857 | 155
Run 7 | 30 folders @ 2k files | 6:55:00 | 24,900 | 1,920 | 3,839,868 | 154

Input-Output operations per second (IOPS)

SQLIO is a stress tool that is used to determine the I/O capacity of a given configuration. SQLIO was run on the system after performance tests were completed. Therefore, several disks backed by SAN LUNs could not be included as they had too much existing data on them.

IOPS as tested with SQLIO tool

The SQLIO test was run on each drive letter individually, and then a final test was run on all drives at the same time. The IOPS/GB value in the right column is calculated by dividing the total IOPS by the drive capacity (for example, 26,514 IOPS ÷ 1,024 GB ≈ 25.89 IOPS per GB for the F: drive). When all of the drives were tested at the same time, 105,730 IOPS was achieved.

LUN | LUN Description | Size (GB) | Reads IOPS (MAX) | Writes IOPS (MAX) | Total IOPS (MAX) | IOPS per GB
F: | SP Service DB | 1,024 | 2,736 | 23,778 | 26,514 | 25.89
G: | Content DBs TranLog | 2,048 | 3,361 | 30,021 | 33,383 | 16.30
L: | Service DBs TranLog | 512 | 2,495 | 28,863 | 31,358 | 61.25
M: | TempDB | 2,048 | 2,455 | 21,778 | 24,233 | 11.83
N: | TempDB Log | 2,048 | 2,751 | 29,522 | 32,273 | 15.76
O: | Content DBs 5 | 3,072 | 2,745 | 28,767 | 31,511 | 10.26
P: | Crawl/Admin DBs | 1,024 | 2,603 | 22,808 | 25,411 | 24.81
All drives at once | | 11,776 | 16,665 | 89,065 | 105,730 | 8.98
Total: | | 11,776 | 19,145 | 185,536 | 310,412 |
Average: | | 1,682 | 2,735 | 26,505 | 38,801 | 22

IOPS achieved during load testing

Performance Monitor jobs were run consistently together with concurrent FAST Indexing, content loading, and Visual Studio 2010 load tests running. The following table reflects the maximum IOPS achieved by LUN and identifies each LUN, and its description, total size, maximum reads, maximum writes, totals IOPS, and IOPS per GB.

Because these results were obtained during testing, the results reflect the IOPS that the test environment was able to drive into the SAN. Because drives H:, I:, J:, and K: could be included, the total IOPS achieved was much higher than for the SQLIO testing.

LUN | LUN Description | Size (GB) | Reads IOPS (MAX) | Writes IOPS (MAX) | Total IOPS (MAX) | IOPS per GB
G: | Content DBs TranLog | 2,048 | 5,437 | 11,923 | 17,360 | 8.48
H: | Content DBs 1 | 6,850 | 5,203 | 18,546 | 23,749 | 3.47
I: | Content DBs 2 | 6,850 | 5,284 | 11,791 | 17,075 | 2.49
J: | Content DBs 3 | 7,500 | 5,636 | 11,544 | 17,180 | 2.29
K: | Content DBs 4 | 7,500 | 5,407 | 11,146 | 16,553 | 2.42
L: | Service DBs TranLog | 512 | 5,285 | 10,801 | 16,086 | 31.42
M: | TempDB | 2,048 | 5,282 | 11,089 | 16,371 | 7.99
N: | TempDB Log | 2,048 | 5,640 | 11,790 | 17,429 | 8.51
O: | Content DBs 5 | 3,072 | 5,400 | 11,818 | 17,218 | 5.60
P: | Crawl/Admin DBs | 1,024 | 5,249 | 11,217 | 16,467 | 16.08
Total: | | 31,365 | 53,824 | 121,667 | 175,491 |
Average: | | 3,136 | 5,382 | 12,167 | 17,549 | 5.60

FAST Search Server 2010 for SharePoint document crawling

Crawling SharePoint Server sites for search is performed by the crawler that is configured to feed to the FAST Content Distributors. The content Search Service Application (SSA) was configured to run on two servers, APP-1 and APP-2, and the query SSA was run on the servers FAST-1 and FAST-2.

100 FAST indexing document processors were run on the SQL Server computer. The following screen from task manager on the computer shows the activity while both document processing and a 10,000 user load test were running with SQL Server on this computer.

Task Manager on PACNEC01 during FAST Indexing and Load Test

Task Manager during FAST Indexing and Load Test

Results from Testing

In order to generate a significant load during testing, the following software was used: Visual Studio 2010 Ultimate, the Visual Studio 2010 load test controller, and Visual Studio Agents 2010. A test rig is required to simulate many users and produce a significant load. A test rig is made up of a test controller computer and one or more test agent computers. The test controller manages and coordinates with the agent computers, and the agents are used to generate load against SharePoint Server. The test controller is also responsible for collecting performance monitor data from the computers that are under test and from the agent computers.

This section identifies the results of the performance test runs.

Test Series A – Vary Users

In this test series, the number of users loaded onto the test farm is varied. The following figure shows the requests per second that the Visual Studio 2010 Ultimate Test Controller was able to process through the SharePoint Server farm during the tests for each of the user load sizes. The chart shows that as additional user load is applied, requests per second increase with the larger number of users. When the test reaches 15,000 users, the farm is heavily loaded; consequently, the requests do not increase as much as the applied load.

Because the 15,000 user test took additional time to ramp up, this test ran for 2 hours instead of the baseline of 1 hour. Due to the load, we also found that 3 front-end Web servers were insufficient. We ran this test with 6 front-end Web servers.

Average RPS for series A

Average RPS for series A chart

In the following graph, you can see that test transaction response time increases together with the page refresh time for the large 15,000 user test. This shows that there is a bottleneck in the system for this large user load. We experienced high IOPS load on the H: drive, which contains the primary data file for the content database during this test. This area could have been investigated further to remove this bottleneck.

Times and WFEs used for series A

Times and WFEs Used for Series A chart

In the following graph, you can see the increasing CPU use as the user load is moved from 4,000 to 10,000 users. CPU use is reduced just for the front-end Web servers (WFEs) as the number of WFEs is doubled from 3 to 6. At the low end of the chart, notice that the APP-1 server has fairly constant CPU use, and the large PACNEC01 SQL Server computer does not reach 3% of total CPU use.

Average CPU use for series A

Average CPU Use for Series A chart

The following table shows a summary of data captured during the three tests in test series A. Data items that show “NA” were not captured.

Test A.1 A.2 A.3

Users

4,000

10,000

15,000

WFEs

3

3

6

Duration

1 hour

1 hour

2 hours

Avg RPS

96.3

203

220

Avg Page Time

0.31 sec

0.71 sec

19.2 sec

Avg Response Time

0.26 sec

0.58 sec

13.2 sec

Avg CPU WFE-1

22.3%

57.3%

29.7%

Available RAM WFE-1

5,828

5,786

13,311

Avg CPU WFE-2

36.7%

59.6%

36.7%

Available RAM WFE-2

5,651

5,552

13,323

Avg CPU WFE-3

22.8%

57.7%

34%

Available RAM WFE-3

5,961

5,769

13,337

Avg CPU PACNEC01

1.29%

2.37%

2.86%

Available RAM PACNEC01

401,301

400,059

876,154

Avg CPU APP-1

6.96%

14.5%

13.4%

Available RAM APP-1

13,745

13,804

13,311

Avg CPU APP-2

0.73%

1.09%

0.27%

Available RAM APP-2

14,815

14,992

13,919

Avg CPU WFE-4

NA

NA

29.7%

Available RAM WFE-4

NA

NA

13,397

Avg CPU WFE-5

NA

NA

30.4%

Available RAM WFE-5

NA

NA

13,567

Avg CPU WFE-6

NA

NA

34.9%

Available RAM WFE-6

NA

NA

13,446

Avg Disk Write Queue Length, PACNEC01 H: SPContent DB1

0.0 (with peak of 0.01)

0.0 (with peak of 0.02)

0.3 (with peak of 24.1)

Test Series B – Vary SQL Server RAM

In this test series the available RAM to SQL Server is varied. The following figure shows that the requests-per-second was not affected by the RAM allocated to SQL Server.

Average Requests per Second for series B

Average Requests per Second for series B chart

Page and transaction response times for series B

Page and transaction response for series B chart

The following graph shows the CPU use for the front-end Web servers (WFE), the App Server, and the SQL Database Server. The three WFEs were constantly busy for all tests, the App Server is mostly idle, and the database server does not increase above 3% CPU usage.

Average CPU use for series B

Average CPU Use for series B chart

Available RAM for series B

Available RAM for series B chart

The following table shows a summary of the data that was captured during the six tests in test series B.

Test B.1 B.2 B.3 B.4 B.5 B.6

SQL RAM

16 GB

32 GB

64 GB

128 GB

256 GB

600 GB

Avg RPS

203

203

203

204

203

202

Avg Page Time

0.66

0.40

0.38

0.42

0.58

0.89

Avg Response Time

0.56

0.33

0.31

0.37

0.46

0.72

Avg CPU WFE-1

57.1%

58.4%

58.8%

60.6%

60%

59%

Available RAM WFE-1

6,239

6,063

6,094

5,908

5,978

5,848

Avg CPU WFE-2

55.6%

60.1%

57.1%

59.6%

60.3%

58.1%

Available RAM WFE-2

6,184

6,079

6,141

6,119

5,956

5,828

Avg CPU WFE-3

59.4%

56%

56.9%

58.4%

61.4%

59.8%

Available RAM WFE-3

6,144

6,128

6,159

6,048

5,926

5,841

Avg CPU PACNEC01

2.84%

2.11%

2.36%

2.25%

2.38%

2.29%

Available RAM PACNEC01

928,946

923,332

918,526

904,074

861,217

881,729

Avg CPU APP-1

14.3%

12.6%

13.3%

12.5%

13.4%

13.8%

Available RAM APP-1

14,163

14,099

14,106

14,125

14,221

14,268

Avg CPU APP-2

1.29%

1.14%

1.2%

1.2%

1.03%

0.96%

Available RAM APP-2

15,013

14,884

14,907

14,888

14,913

14,900

Test Series C – vary transaction mix

In this test series, the proportion of search transactions in the workload mix is varied.

Average RPS for series C

Average RPS for series C chart

The following graph shows that test C.6 had significantly longer page response times, which indicates that the SharePoint Server 2010 and FAST Search Server 2010 for SharePoint farm was overloaded during this test.

Page and transaction response times for series C

Page and transaction times for series C chart

Average CPU times for series C

Average CPU Time for series C chart

Average RAM for series C

Average RAM for series C chart

The following table shows a summary of the data that was captured during the six tests in test series C.

Test C.1 C.2 (baseline) C.3 C.4 C.5 C.6

Open

30%

30%

20%

20%

25%

5%

Browse

55%

40%

40%

30%

25%

20%

Search

15%

30%

40%

50%

50%

75%

Avg RPS

235

203

190

175

168

141

Avg Page Time (S)

1.19

0.71

0.71

0.43

0.29

25.4

Avg Response Time (S)

0.87

0.58

0.20

0.33

0.22

16.1

Avg CPU WFE-1

62.2%

57.30%

44.2%

40.4%

36.1%

53.1%

Available RAM WFE-1

14,091

5,786

6,281

6,162

6,069

13,766

Avg CPU WFE-2

65.2%

59.60%

45.2%

40.1%

37.6%

58.8%

Available RAM WFE-2

13,944

5,552

6,271

6,123

6,044

13,726

Avg CPU WFE-3

65.3%

57.70%

49.4%

44.2%

39.6%

56.8%

Available RAM WFE-3

13,693

5,769

6,285

6,170

6,076

13,716

Avg CPU PACNEC01

2.4%

2.37%

2.6%

2.51%

2.32%

3.03%

Available RAM PACNEC01

899,613

400,059

814,485

812,027

808,842

875,890

Avg CPU APP-1

8.27%

14.50%

17.8%

20.7%

18.4%

16.2%

Available RAM APP-1

13,687

13,804

14,002

13,991

13,984

13,413

Avg CPU APP-2

0.28%

NA

0.88%

0.8%

0.79%

0.14%

Available RAM APP-2

13,916

NA

14,839

14,837

14,833

13,910

Avg CPU FAST-1

8.39%

NA

NA

NA

NA

16.6%

Available RAM FAST-1

13,998

NA

NA

NA

NA

13,686

Avg CPU FAST-2

8.67%

NA

NA

NA

NA

16.7%

Available RAM FAST-2

14,135

NA

NA

NA

NA

13,837

Avg CPU FAST-IS1

37.8%

NA

NA

NA

NA

83.4%

Available RAM FAST-IS1

2,309

NA

NA

NA

NA

2,298

Avg CPU FAST-IS2

30.2%

NA

NA

NA

NA

66.1%

Available RAM FAST-IS2

5,162

NA

NA

NA

NA

5,157

Avg CPU FAST-IS3

30.6%

NA

NA

NA

NA

69.9%

Available RAM FAST-IS3

5,072

NA

NA

NA

NA

5,066

Avg CPU FAST-IS4

25.6%

NA

NA

NA

NA

58.2%

Available RAM FAST IS-4

5,243

NA

NA

NA

NA

5,234

Test Series D – Vary Front-End Web Server RAM

In this test series, the RAM on each front-end Web server virtual machine is varied.

Average RPS for series D

Average RPS for series D chart

Page and transaction response times for series D

Page and transaction time for series D chart

Average CPU times for series D

Average CPU Time for series D chart

The following graph shows that the available RAM on each front-end Web server is always the RAM allocated to the virtual machine less 2 GB. This shows that for the 10,000 user load and this test transaction mix, the front-end Web servers require a minimum of 2 GB of RAM plus any reserve.

Available RAM for series D

Available RAM for series D chart

The following table shows a summary of the data that was captured during the four tests in test series D.

Test D.1 D.2 D.3 D.4

WFE RAM

4 GB

6 GB

8 GB

16 GB

Avg RPS

189

188

188

188

Avg Page Time (S)

0.22

0.21

0.21

0.21

Avg Response Time (S)

0.17

0.16

0.16

0.16

Avg CPU WFE-1

40.5%

37.9%

39.6%

37.3%

Available RAM WFE-1

2,414

4,366

6,363

14,133

Avg CPU WFE-2

42.3%

40%

40.3%

39.5%

Available RAM WFE-2

2,469

4,356

6,415

14,158

Avg CPU WFE-3

42.6%

42.4%

42.2%

43.3%

Available RAM WFE-3

2,466

4,392

6,350

14,176

Avg CPU PACNEC01

2.04%

1.93%

2.03%

2.14%

Available RAM PACNEC01

706,403

708,725

711,751

706,281

Avg CPU APP-1

11.8%

13.1%

12.9%

12.3%

Available RAM APP-1

13,862

13,866

13,878

13,841

Avg CPU APP-2

0.84%

0.87%

0.81%

0.87%

Available RAM APP-2

14,646

14,650

14,655

14,636

Avg CPU WFE-4

42.3%

43.6%

41.9%

45%

Available RAM WFE-4

2,425

4,342

6,382

14,192

Test Series E – Vary number of front-end Web servers

In this test series, the number of front-end Web servers in the farm is varied. The following figure shows that the average RPS is slightly lower with 2 and 3 front-end Web servers because the system does not completely keep up with the applied user load. With 4, 5, or 6 front-end Web servers, requests per second are constant because the system is handling the full load from the test agents.

Average RPS for Series E

Average RPS for Series E chart

A similar pattern is shown in the following graph, where you can see that response times are high for 2 and 3 WFEs and then very low for the larger numbers of front-end Web servers.

Page and transaction time for Series E

Page and transaction times for series E chart

The following graph shows that CPU use is lower when more front-end Web servers are available. Six front-end Web servers clearly reduce the average CPU utilization across the front-end Web servers, but only four front-end Web servers are required for the 10,000 user load. Notice that you cannot tell from this chart which configurations are handling the load and which are not; for three front-end Web servers, which we identified as not completely handling the load, the front-end Web server CPU is just over 50%.

Average CPU for Series E

Average CPU for Series E chart

Available RAM for Series E

Available RAM for Series E chart

The following table shows a summary of the data that was captured during the five tests in test series E.

Test E.1 E.2 E.3 E.4 E.5

WFE Servers

2

3

4

5

6

Avg RPS

181

186

204

204

205

Avg Page Time (S)

8.02

0.73

0.23

0.20

0.22

Avg Response Time (S)

6.34

0.56

0.19

0.17

0.18

Avg CPU WFE-1

77.4

53.8

45.7

39.2

32.2

Available RAM WFE-1

5,659

6,063

6,280

6,177

6,376

Avg CPU WFE-2

76.2%

53.8%

45.9%

38.2%

28.8%

Available RAM WFE-2

5,623

6,132

6,105

6,089

5,869

Avg CPU WFE-3

NA

52.5%

43.9%

37.7%

31.2%

Available RAM WFE-3

NA

6,124

6,008

5,940

6,227

Avg CPU WFE-4

NA

NA

44.5%

34.8%

34.7%

Available RAM WFE-4

NA

NA

6,068

6,068

6,359

Avg CPU WFE-5

NA

NA

NA

35.1%

32%

Available RAM WFE-5

NA

NA

NA

6,090

6,245

Avg CPU WFE-6

NA

NA

NA

NA

33.9%

Available RAM WFE-6

NA

NA

NA

NA

5,893

Avg CPU PACNEC01

2.13%

1.93%

2.54%

2.48%

2.5%

Available RAM PACNEC01

899,970

815,502

397,803

397,960

397,557

Avg CPU APP-1

9.77%

11.7%

15%

14.7%

13.6%

Available RAM APP-1

14,412

13,990

14,230

14,227

14,191

Avg CPU APP-2

1.06%

0.92%

1%

1%

1.04%

Available RAM APP-2

14,928

14,841

14,874

14,879

14,869

Test Series F – Vary SQL Server CPUs

In this test series the number of CPUs that are available to SQL Server varies.

Average RPS for series F

Average RPS for series F chart

The following graph shows that despite minimal CPU use on the SQL Server computer, the page and transaction response times increase when SQL Server has fewer CPUs available to work with.

Page and transaction time for series F

Page and transaction time for series F chart

The following graph shows that the SQL Server average CPU usage for the whole computer does not exceed 3%. CPU usage for the three front-end Web servers was approximately 55% throughout the tests.

Average CPU for series F

Average CPU use for series F chart

Available RAM for series F

Available RAM for series F chart

The following table shows a summary of the data that was captured during the five tests in test series F.

Test F.1 F.2 F.3 F.4 F.5

SQL CPUs

4

6

8

16

80

Avg RPS

194

200

201

203

203

Avg Page Time (S)

4.27

2.33

1.67

1.2

0.71

Avg Response Time (S)

2.91

1.6

1.16

0.83

0.58

Avg CPU WFE-1

57.4%

57.4%

56.9%

55.5%

57.30%

Available RAM WFE-1

13,901

13,939

13,979

14,045

5,786

Avg CPU WFE-2

60.3%

58.9%

62.6%

61.9%

59.60%

Available RAM WFE-2

13,920

14,017

13,758

14,004

5,552

Avg CPU WFE-3

56.8%

62%

61%

62.1%

57.70%

Available RAM WFE-3

13,859

13,942

13,950

13,971

5,769

Avg CPU PACNEC01

1.56%

2.57%

2.69%

2.69%

2.37%

Available RAM PACNEC01

865,892

884,642

901,247

889,479

400,059

Avg CPU APP-1

12.5%

12.8%

12.8%

12.8%

14.50%

Available RAM APP-1

13,856

13,713

13,725

13,745

13,804

Avg CPU APP-2

0.22%

0.25%

0.26%

0.25%

NA

Available RAM APP-2

14,290

14,041

14,013

13,984

NA

Avg CPU FAST-1

12.8%

13%

13%

13%

NA

Available RAM FAST-1

13,913

14,051

14,067

14,085

NA

Avg CPU FAST-2

12.9%

13.4%

13.3%

13.5%

NA

Available RAM FAST-2

14,017

14,170

14,183

14,184

NA

Service Pack 1 (SP1) and June Cumulative Update (CU) Test

After the SharePoint Server 2010 farm was fully populated with 120 million items, SharePoint Server 2010 with SP1 and FAST Search Server 2010 for SharePoint SP1 were applied to determine how long the process would take on a large populated farm.

SharePoint Server 2010

Microsoft SharePoint Server 2010 with Service Pack 1 (SP1) and the June Cumulative Update were applied in the lab to determine a base upgrade time for a large-scale Document Center farm scenario. The following table reflects the servers in the farm that required the SP1 and June CU upgrades, the start and end time of each install, the total time of installs, the start and end time of PSCONFIG upgrade command, the total time of PSCONFIG upgrade command, the total time of the upgrade by server name, and the total installation times.

Server Name SP1 Start SP1 End Diff (h:mm:ss) June CU Start June CU End Diff (h:mm:ss) PSConfig Start PSConfig End Diff (h:mm:ss)

APP-1

7/12/2011 4:00:00

7/12/2011 4:15:51

0:15:51

7/29/2011 10:45:00

7/29/2011 11:00:05

0:15:05

7/29/2011 13:25:50

7/29/2011 13:30:15

0:04:25

APP-2

7/12/2011 4:26:07

7/12/2011 4:39:31

0:13:24

7/29/2011 11:02:30

7/29/2011 11:17:23

0:14:53

7/29/2011 13:33:15

7/29/2011 13:35:11

0:01:56

WFE-1

7/12/2011 4:41:05

7/12/2011 4:49:16

0:08:11

7/29/2011 11:23:00

7/29/2011 11:31:07

0:08:07

7/29/2011 13:36:35

7/29/2011 13:38:11

0:01:36

WFE-2

7/12/2011 4:50:24

7/12/2011 4:57:47

0:07:23

7/29/2011 11:32:45

7/29/2011 11:40:46

0:08:01

7/29/2011 13:39:20

7/29/2011 13:40:54

0:01:34

WFE-3

7/12/2011 4:59:00

7/12/2011 5:06:39

0:07:39

7/29/2011 11:42:00

7/29/2011 11:49:47

0:07:47

7/29/2011 13:42:40

7/29/2011 13:44:14

0:01:34

WFE-4

7/12/2011 5:10:06

7/12/2011 5:17:30

0:07:24

7/29/2011 11:51:00

7/29/2011 11:58:49

0:07:49

7/29/2011 13:46:05

7/29/2011 13:47:41

0:01:36

WFE-5

7/12/2011 5:18:49

7/12/2011 5:27:07

0:08:18

7/29/2011 11:59:45

7/29/2011 12:08:19

0:08:34

7/29/2011 13:49:00

7/29/2011 13:50:36

0:01:36

WFE-6

7/12/2011 5:28:25

7/12/2011 5:35:40

0:07:15

7/29/2011 12:09:30

7/29/2011 12:17:10

0:07:40

7/29/2011 13:52:00

7/29/2011 13:53:35

0:01:35

WFE-CRAWL1

7/12/2011 5:37:20

7/12/2011 5:44:35

0:07:15

7/29/2011 12:18:10

7/29/2011 12:25:51

0:07:41

7/29/2011 13:54:35

7/29/2011 13:56:19

0:01:44

FAST-SSA-1

7/12/2011 5:49:00

7/12/2011 5:57:45

0:08:45

7/29/2011 12:39:40

7/29/2011 12:48:24

0:08:44

7/29/2011 13:57:30

7/29/2011 13:59:07

0:01:37

FAST-SSA-2

7/12/2011 5:59:08

7/12/2011 6:08:29

0:09:21

7/29/2011 12:51:30

7/29/2011 13:00:11

0:08:41

7/29/2011 14:00:00

7/29/2011 14:01:58

0:01:58

Total Time:

1:40:46

1:43:02

0:21:11

Grand Total:

3:44:59

FAST Search Server 2010 for SharePoint

The FAST Search Server 2010 for SharePoint Service Pack 1 (SP1) upgrade required approximately 15 minutes per node to upgrade.

SQL Server content database backups

A SQL Server database backup was executed on the content database for Document Center 1 (SPContent01). A backup (B/U) was performed before SP1 was applied and again after SP1 and the June Cumulative Update (CU) were applied. The following table shows the backup time and size details.

Database Name | B/U Start | B/U End | Diff (h:mm:ss) | Size (TB) | Notes
SPContent01 | 7/10/2011 9:56:00 | 7/10/2011 23:37:00 | 13:41:00 | 14.40 | Pre-SP1
SPContent01 | 7/29/2011 14:22:10 | 7/30/2011 4:28:00 | 14:05:50 | 14.40 | Post-SP1 / June CU
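For reference, a full backup of the kind timed above can be issued as T-SQL; the C# sketch below runs it against the database server, with a placeholder connection string and backup target (the exact backup options that were used in the lab are not documented in this article).

```
using System.Data.SqlClient;

class BackupContentDatabase
{
    static void Main()
    {
        // Placeholder connection string and backup target.
        string connectionString = "Data Source=PACNEC01;Initial Catalog=master;Integrated Security=True";
        string backupSql =
            @"BACKUP DATABASE [SPContent01]
              TO DISK = N'R:\Backups\SPContent01.bak'
              WITH STATS = 5;";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(backupSql, connection))
        {
            command.CommandTimeout = 0;   // a backup of a 14 TB database runs for many hours
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```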

Conclusions

The SharePoint Server 2010 farm was successfully tested at 15,000 concurrent users by using two SharePoint Server content databases that included a total of 120 million documents. The SharePoint Server 2010 farm with three front-end Web servers, as was specified in the baseline environment, was not able to support the load of 15,000 concurrent users. Six front-end Web servers were required for this load.

Recommendations

The following is a summary list of the recommendations. In each section the hardware notes are not intended to be a comprehensive list, but to indicate the minimum hardware that was found to be required for the 15,000 concurrent user load test against a 120 million document SharePoint Server 2010 farm.

  • Hardware notes for load:

    • 64 GB RAM on SQL Server

    • 16 CPU cores on SQL Server

  • Provide sufficient I/O capability: 2 IOPS per GB stored in the SharePoint Server 2010 content database

  • Set Microsoft SQL Server 2008 R2 property Maximum Degree of Parallelism (MAXDOP)=1; the default is 0

  • Use multiple LUNs (drive letters) on the SAN, each with a SQL Server data file, and allocate one virtual CPU for each LUN used. We used 5 data files, all on separate LUNs

  • Hardware notes for load:

    • 8 GB RAM on each front-end Web server

    • 6 front-end Web servers

  • Add the Disable Loopback Check Registry Key at \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\DisableLoopbackCheck=1

  • Reduce table index fragmentation issues manually during bulk document import by running ALTER INDEX on the affected table indexes.

  • Prefer SPFileCollection.Add for bulk import of documents over creating duplicate documents by using SPFolder.CopyTo.

  • Hardware notes for load:

    4 rows of FAST Search Server 2010 for SharePoint index servers

  • Registry Updates for SharePoint Server 2010 document crawler

    On nodes that run the FAST Content SSA crawler (APP-1 and APP-2), the following registry values were updated to improve crawler performance. The values are under the following registry key:

    HKLM\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager

    1. FilterProcessMemoryQuota

      Default value of 100 megabytes (MB) was changed to 200 MB

    2. DedicatedFilterProcessMemoryQuota

      Default value of 100 megabytes (MB) was changed to 200 MB

    3. FolderHighPriority

      Default value of 50 was changed to 500

  • Monitor the FAST Search Server 2010 for SharePoint Index Crawl

    The crawler should be monitored at least three times per day. In these tests, 100 million items took about two weeks to crawl. The following four items were checked every time that the crawl was monitored:

    1. rc –r | select-string “# doc”

      Checks how busy the document processors are

    2. Monitoring crawl queue size

      Use reporting or SQL Server Management Studio to see MSCrawlURL

    3. Indexerinfo –a doccount

      Make sure all indexers are reporting to see how many documents are indexed in 1000 milliseconds. This number ran from 40 to 120 depending on the type of documents being indexed at the time.

    4. Indexerinfo –a status

      Monitor the health of the indexers and partition layout
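Several of the settings in the list above can be applied from code. The following C# sketch shows the MAXDOP setting, an index rebuild of the kind used to reduce fragmentation during bulk import, and the registry values. The connection string is a placeholder, the table name used for ALTER INDEX is only an example (the article does not name the affected indexes), and the unit expected by the two memory quota registry values should be verified before applying them.

```
using System.Data.SqlClient;
using Microsoft.Win32;

class RecommendedSettings
{
    // Set max degree of parallelism to 1 on the SQL Server instance (default is 0).
    static void SetMaxDop(string connectionString)
    {
        string sql = @"EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
                       EXEC sp_configure 'max degree of parallelism', 1; RECONFIGURE;";
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    // Rebuild indexes on a content database table during bulk import; [AllDocs] is only an
    // example table name, the article does not identify the affected indexes.
    static void RebuildIndexes(string connectionString)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("ALTER INDEX ALL ON [dbo].[AllDocs] REBUILD;", conn))
        {
            cmd.CommandTimeout = 0;
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    // Registry values from the recommendations above; run on the crawl servers and WFEs.
    static void ApplyRegistryValues()
    {
        using (RegistryKey lsa = Registry.LocalMachine.OpenSubKey(
            @"SYSTEM\CurrentControlSet\Control\Lsa", true))
        {
            lsa.SetValue("DisableLoopbackCheck", 1, RegistryValueKind.DWord);
        }

        using (RegistryKey gatherer = Registry.LocalMachine.OpenSubKey(
            @"SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager", true))
        {
            // 100 -> 200 per the article; verify the unit (MB vs. bytes) expected by these values.
            gatherer.SetValue("FilterProcessMemoryQuota", 200, RegistryValueKind.DWord);
            gatherer.SetValue("DedicatedFilterProcessMemoryQuota", 200, RegistryValueKind.DWord);
            gatherer.SetValue("FolderHighPriority", 500, RegistryValueKind.DWord);
        }
    }
}
```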

References