I’m all about the free stuff.
Realtime Publishers publishes dozens of free e-books on their website on a variety of technical topics.
This 103-page e-book by Wendy Henry, part of the “Realtime Shortcut Guide” series, is aimed at SharePoint architects, system administrators, and anyone involved in planning the rollout of SharePoint in their environment.
The first chapter makes the case that SharePoint’s role in user collaboration, together with the fact that SharePoint is itself a collaboration between the operating system, IIS, SQL Server, and the components of the SharePoint application, makes planning for both capacity growth and performance a challenge. Starting with the capacity requirements collected from the Microsoft deployment guides, the book examines where these guides fall short and why. Things the deployment guides might fail to adequately provide for include Features, workflows, galleries, metadata, and the organic nature of data growth in a system that is largely user-defined. Beyond these, the book also takes into account the way data is stored: how the NTFS file system can fragment over time, the extent and page overheads in SQL data, and the ways RAID striping impacts data storage. The breadth of considerations covered makes this chapter a valuable tool in planning for capacity needs. With links to performance tools and guidance on performance monitoring, the chapter also provides solid advice on measuring the health and performance of your storage solution.
The second chapter starts with suggestions for tuning Windows for running SharePoint, along with warnings, where warranted, about the potential dangers of using those suggestions. The discussion extends to memory and processor specifications, as both can impact disk needs and performance. This is followed by information on optimizing IIS, including how choosing an authentication system other than Windows Authentication increases the disk requirements for authentication. Where it starts to get interesting is the optimization of SQL Server, which can have a big impact on performance, on the disk space used, and on the architecture of the storage system.
The book strongly recommends using the 64-bit version of SQL Server to take advantage of its larger memory addressing capabilities, which in turn reduce disk thrashing and boost performance. If you are stuck with 32-bit, several options are listed for enhancing memory performance. Also recommended are moving the tempdb database off the default (C:) drive and monitoring several counters for early warning of potential problems. Compression utilities don’t provide much return for the processing overhead, so they are not recommended.
The SharePoint databases stored on the SQL Server are another prime target for performance optimization, and are where many of the storage utilization issues are going to be focused. The book recommends placing the SharePoint configuration databases on disk sets separate from the OS, paging files, log files, and SQL Server application files, and keeping the content databases separated from their log files. The content databases are highly normalized and very volatile, and as such benefit the most from exclusive use of disk IO. The Administration Database and its logs, however, should see little change, as customization of the Administration site is highly discouraged by Microsoft best practices, so they can safely reside on the same disk. SharePoint Search, because of its intensive IO requirements, is recommended to be on its own multi-spindle storage solution.
The final section of Chapter two deals with storage options, and covers internal disks, DAS, NAS and SAN. The author recommends a SAN solution because, despite the costs involved, the solution has the most flexibility, the best disaster recovery options and highest availability.
Chapter three had me concerned from the outset. The book is sponsored by an iSCSI solution provider, and it seemed to me that devoting a quarter of the book to a particular storage solution cast a little doubt on the independence of the information being provided. While there is a short section on why iSCSI is preferable to Fibre Channel for SharePoint, I was left feeling that the section should have had a bit more depth. It briefly mentions security protocols, disaster recovery options, and implementation and administration costs as reasons to choose iSCSI, but gives little detail on why these claims are true. For readers lacking in-depth knowledge of SAN fabrics, a little more detail (or a reference to some) would help in presenting the case to their stakeholders.
In any case, Chapter three provides an overview of iSCSI architecture and implementation, and some hints about network monitoring. It also discusses more general SAN topics, like the benefits of abstracting disk management, file placement, and disaster recovery away from the initiating servers (the SQL and other servers using the data on the SAN). Thin provisioning is recommended as a way to get the most out of your hardware investment while still maintaining data availability as storage needs grow (although it is not recommended for space allocated to the transaction log files). If thin provisioning is chosen, the book recommends regularly auditing storage usage to track trends. The chapter also provides a set of best practices for iSCSI, including multipath IO, compression, encryption, and performance monitoring.
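To make the auditing advice concrete, here is a minimal sketch of my own (it is not taken from the book) of how a series of audit readings could be turned into a rough "days until the pool is full" estimate. It assumes roughly linear growth, and the function names, the sample figures, and the 500 GB pool size are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Each audit sample is (day_number, used_gb). Hypothetical data shape.
Sample = Tuple[float, float]


def growth_rate_gb_per_day(samples: Sequence[Sample]) -> float:
    """Least-squares slope of used space (GB) against time (days)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two audit samples")
    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    denom = sum((day - mean_day) ** 2 for day, _ in samples)
    if denom == 0:
        raise ValueError("samples must cover more than one point in time")
    num = sum((day - mean_day) * (used - mean_used) for day, used in samples)
    return num / denom


def days_until_full(samples: Sequence[Sample],
                    physical_capacity_gb: float) -> Optional[float]:
    """Days from the latest sample until usage reaches physical capacity.

    Assumes growth stays linear. Returns None if usage is flat or shrinking.
    """
    rate = growth_rate_gb_per_day(samples)
    if rate <= 0:
        return None
    _, latest_used = max(samples, key=lambda s: s[0])
    remaining = max(physical_capacity_gb - latest_used, 0.0)
    return remaining / rate


if __name__ == "__main__":
    # Three monthly audits of a hypothetical thin-provisioned pool.
    audits = [(0, 100.0), (30, 130.0), (60, 160.0)]
    print(growth_rate_gb_per_day(audits))
    print(days_until_full(audits, physical_capacity_gb=500.0))
```

A straight-line fit is the simplest possible trend model. Real SharePoint growth tends to be lumpy, so an estimate like this is only a prompt to look more closely, not a forecast to rely on.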
Chapter four covers disaster recovery and high availability. While some of the chapter focuses on basics like RAID, there is plenty of coverage of more sophisticated topics like SAN replication, virtual machines, warm standbys, and database mirroring.
I got a lot out of reading the book. I refreshed my understanding of the structure and relationship of the databases created by SharePoint, learned a lot about SAN storage, and came away with some creative avenues to think about when considering how to deploy SharePoint. I also now have a tool to refer back to when planning a deployment (which I am hopeful might be in the near future!) so I can avoid missteps.
The book has a couple of inevitable flaws. First, the Shortcut Guide series, by its nature, has a fairly narrowly defined scope. You won’t find much in-depth discussion of topics that affect SharePoint storage but fall outside that scope, like some of the finer points of SQL optimization and fine-tuning OLTP and OLAP databases. However, the book flags these topics as out of scope, and the author still provides some general best practices and references for further, more in-depth reading.
The second major inevitable flaw comes from the fact that SharePoint has so many different features. Different implementations of SharePoint will have different requirements based on the features used and how the organization uses them, so there is no single set of best practices. Often the answer in the book is “It depends” (hence the title of the book, “Storage Considerations”). This means that organizations implementing SharePoint still need to do a lot of up-front requirements gathering before design can even begin, and need experienced professionals to guide the process. However, with this book as a guide, implementers will be better equipped to ask stakeholders the right questions and will have a better understanding of the pros, cons, and tradeoffs of the many choices available.
I did have one other, minor problem with the book. Reading it as an e-book, I wouldn’t have noticed, but I printed out a few pages to read while I was waiting for my car to get fixed. The page numbers for chapter four don’t match the page numbers at the end of chapter three, which caused a bit of confusion. A small amount of editorial attention could have resolved the issue, and of course it raises the question of what other details were missed. (See the note below in the comments.)
A big thanks to Left Hand Networks for sponsoring the book so that everyone can have access to the information free of charge.